Quantifying contention and balancing memory load on hardware DSM multiprocessors
Abstract
This paper makes the following contributions. It proposes a new methodology for quantifying remote memory access contention on hardware DSM multiprocessors. The most valuable aspect of this methodology is that it assesses the overhead of contention on real parallel programs running on real hardware. The methodology uses as input the number of accesses from each node of the DSM to each page in memory. A trace of the program's memory accesses, obtained at runtime, is used to compute a fairly accurate estimate of the fraction of execution time wasted due to contention. The paper also presents a new algorithm that detects potential hot spots in pages and balances memory load using dynamic page migration. The algorithm attacks the problem of contention indirectly, by balancing the remote memory access latency across the nodes of the system. Experiments with five irregular parallel codes on a 128-processor Origin2000 show that the algorithm yields significant performance improvements. © 2003 Published by Elsevier Inc.
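The abstract does not give the algorithm's details, so the following is only an illustrative sketch of the general idea: take per-node, per-page access counts as input, find the node serving the most remote accesses, and migrate pages away from it. The function names (`node_loads`, `page_remote_load`, `balance`), the definition of load as the number of remote accesses a node serves, and the greedy move rule are all assumptions made for this example, not the paper's method.

```python
def node_loads(accesses, home, num_nodes):
    """Remote accesses served by each node (assumed load metric).

    accesses[n][p] is the number of accesses node n issued to page p;
    home[p] is the node whose memory currently holds page p.
    """
    loads = [0] * num_nodes
    for n, per_page in enumerate(accesses):
        for p, count in enumerate(per_page):
            if home[p] != n:  # only remote accesses load the home node
                loads[home[p]] += count
    return loads


def page_remote_load(accesses, page, target):
    """Remote accesses `target` would serve if it were the home of `page`."""
    return sum(row[page] for n, row in enumerate(accesses) if n != target)


def balance(accesses, home, max_moves=10):
    """Greedy sketch: migrate pages from the most to the least loaded node."""
    home = list(home)  # work on a copy; the caller's placement is untouched
    num_nodes = len(accesses)
    for _ in range(max_moves):
        loads = node_loads(accesses, home, num_nodes)
        hot = max(range(num_nodes), key=loads.__getitem__)
        cold = min(range(num_nodes), key=loads.__getitem__)
        best = None
        for p, h in enumerate(home):
            if h != hot:
                continue
            new_hot = loads[hot] - page_remote_load(accesses, p, hot)
            new_cold = loads[cold] + page_remote_load(accesses, p, cold)
            peak = max(new_hot, new_cold)
            # Accept a move only if it lowers the load of the hottest node
            # without pushing the receiving node above that level.
            if peak < loads[hot] and (best is None or peak < best[0]):
                best = (peak, p)
        if best is None:
            break  # no single migration improves the balance
        home[best[1]] = cold
    return home


# Hypothetical counts: 3 nodes, 4 pages, every page initially on node 0.
accesses = [
    [10, 0, 0, 0],
    [0, 50, 5, 0],
    [0, 5, 40, 30],
]
initial_home = [0, 0, 0, 0]
balanced_home = balance(accesses, initial_home)
```

Comparing `node_loads(accesses, initial_home, 3)` with `node_loads(accesses, balanced_home, 3)` shows how the remote load concentrated on node 0 is spread across the nodes. A real system would also have to weigh the cost of each migration, which this sketch ignores.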
Similar resources
Quantifying and Resolving Remote Memory Access Contention on Hardware DSM Multiprocessors
This paper makes the following contributions: It proposes a new methodology for quantifying remote memory access contention on hardware DSM multiprocessors. The most valuable aspect of this methodology is that it assesses the impact of contention on real parallel programs running on real hardware. The methodology uses as input the number of accesses from each DSM node to each page in memory. A ...
Distributed, low contention task allocation
Designing a good task allocation algorithm faces the challenge of allowing high levels of throughput, so that tasks are executed fast and processor parallelism is exploited, while still guaranteeing a low level of memory contention, so that performance does not suffer because of limitations on processor-to-memory bandwidth. In this work, we present a comparative study of throughput and contentio...
Parallel Sorting by Regular Sampling
A new parallel sorting algorithm suitable for MIMD multiprocessors is presented. The algorithm reduces memory and bus contention, which many parallel sorting algorithms suffer from, by using a regular sampling of the data to ensure good pivot selection. For n data elements to be sorted and p processors, when n ≥ p³ the algorithm is shown to be asymptotically optimal. In theory, the algorithm i...
A Summary of Research in System Software and Concurrency at the University of Malta: Multithreading
Multithreading has emerged as a leading paradigm for the development of applications with demanding performance requirements. This can be attributed to the benefits that are reaped through the overlapping of I/O with computation and the added bonus of speedup when multiprocessors are employed. However, the use of multithreading brings with it new challenges. Cache utilisation is often very poor...
A Comparison of Software and Hardware Synchronization Mechanisms for Distributed Shared Memory Multiprocessors
Efficient synchronization is an essential component of parallel computing. The designers of traditional multiprocessors have included hardware support only for simple operations such as compare-and-swap and load-linked/store-conditional, while high-level synchronization primitives such as locks, barriers, and condition variables have been implemented in software. With the advent of directory-based dis...
Journal title:
- J. Parallel Distrib. Comput.
Volume 63
Publication date: 2003